=user interface
Eye tracking interfaces are potentially very nice to use. What are their advantages and limitations? What holds their implementation back, and how could those issues be resolved?
Looking at things is easy and
fast, more so than pointing at them using a mouse or trackpad, and doesn't
require extra practice.
Also, your eyes move to look at things
slightly before you become consciously aware of looking at them.
When used properly in an interface, this gives the impression of the
computer anticipating your intention, not in a creepy way, but in a
satisfying way, like you're a surgeon and a nurse puts the right tool in
your hand as you need it.
But, of course, there are some problems.
When you look at something, that doesn't mean your eyes are pointed
directly at that thing; it just means that thing is in your
foveal vision.
So, even with perfect tracking, it's only possible to get within
perhaps 1° of
the intended point. Many buttons on interfaces designed for use with a mouse
are smaller than this, and of course, tracking is never perfect; 2° error is
more realistic for a consumer system.
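To make those numbers concrete, here's a rough conversion from angular error to
on-screen distance (the ~60 cm viewing distance and ~96 DPI monitor are my own
illustrative assumptions):

```python
import math

def gaze_error_on_screen(error_deg, viewing_distance_cm=60, pixels_per_cm=38):
    """Convert an angular gaze-tracking error into a distance on the screen."""
    error_cm = viewing_distance_cm * math.tan(math.radians(error_deg))
    return error_cm, error_cm * pixels_per_cm

for err_deg in (1.0, 2.0):
    cm, px = gaze_error_on_screen(err_deg)
    print(f"{err_deg}° error ≈ {cm:.1f} cm ≈ {px:.0f} px on the screen")
# 1.0° error ≈ 1.0 cm ≈ 40 px
# 2.0° error ≈ 2.1 cm ≈ 80 px -- wider than many toolbar buttons and menu items
```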
Of course, it's possible to
design applications in ways where that's good enough. You "just" need to
redesign all your application interfaces. How much did development of
all mobile operating systems and apps cost? It happened, but it wasn't
cheap.
Also, continuing to look at the same thing doesn't mean your
eyes are stationary: unconscious microsaccades happen constantly even within
a fixation, and the eyes typically jump to a new fixation point 3 to 6 times
per second. To get good
tracking performance, cameras must have both high resolution and moderately
high framerates, which has been a relatively expensive combination, but the
costs of cameras have come down.
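As a rough illustration of why framerate matters here (the ~200 ms fixation
length is my own ballpark figure):

```python
# With 3 to 6 fixations per second, a single fixation lasts very roughly 200 ms.
# The tracker wants several frames within each fixation to average out
# microsaccade jitter and sensor noise, and to notice a new fixation quickly.
for fps in (30, 60, 120):
    frame_ms = 1000 / fps
    print(f"{fps:3d} fps: {frame_ms:5.1f} ms/frame, ~{200 / frame_ms:.0f} frames per fixation")
# 30 fps leaves only ~6 frames to average over; 120 fps gives ~24 and reacts sooner.
```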
High resolution with high
framerate also requires good lighting, because each pixel must receive a
certain number of photons each frame. Fortunately, eyes reflect infrared
light, so if you put an IR LED next to your camera and an IR-pass filter
(which blocks visible light) on the camera, the pupil is fairly easy to pick
out. Most eye trackers do that.
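Once you have an IR-lit view of the eye, the basic image processing can be
quite simple. Here's a crude sketch of dark-pupil detection (assuming OpenCV 4
and an LED that's off-axis enough for the pupil to show up dark; the threshold
is a made-up number, and real trackers also use the corneal glint from the LED
plus per-user calibration to turn pupil position into a gaze point):

```python
import cv2

def find_pupil_center(eye_gray):
    """Crude dark-pupil detection on an IR-lit grayscale image of the eye region."""
    blurred = cv2.GaussianBlur(eye_gray, (7, 7), 0)
    # With off-axis IR illumination, the pupil is usually the darkest blob.
    _, mask = cv2.threshold(blurred, 40, 255, cv2.THRESH_BINARY_INV)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    if not contours:
        return None
    pupil = max(contours, key=cv2.contourArea)
    m = cv2.moments(pupil)
    if m["m00"] == 0:
        return None
    return (m["m10"] / m["m00"], m["m01"] / m["m00"])  # (x, y) centroid in pixels
```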
In theory, you could use face detection with a low-resolution camera
to target another camera at an eye. But a camera that mechanically moves to
track users costs money, can make noise, and can be kind of creepy for
users. A more common approach is to attach the eye tracking camera and IR
light to the user's head.
VR headsets already involve the user wearing something, and they need high
resolution and high refresh rates for a good experience, so they seem like a
perfect fit for head-mounted eye tracking and foveated rendering. But there's
an obvious question: if the display is supposed to fill the user's vision,
where do you put the camera? It is possible to split
the light by separating out IR or using half-silvered mirrors, but that
obviously costs money and increases weight. Still, eye trackers for VR
headsets are being actively pursued, not as an interface system, but for
foveated
rendering.
The eye tracking accuracy needed for foveated
rendering is substantially lower than what's needed for a good interface, so
an affordable fixed camera on a computer monitor is good enough for that. As
such, I don't think that foveated rendering is necessarily tied to VR.
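That tolerance is easy to see in code: if the tracker might be off by a couple
of degrees, you just pad the full-detail region by that much (a sketch with
illustrative thresholds, not taken from any particular headset or monitor
system):

```python
def shading_rate(eccentricity_deg, gaze_error_deg=2.0):
    """Pick a coarser shading rate further from the reported gaze point,
    padded by the expected tracking error so mistakes stay hard to notice."""
    e = max(0.0, eccentricity_deg - gaze_error_deg)
    if e < 5:
        return 1    # full resolution around the gaze point
    elif e < 15:
        return 2    # shade in 2x2 pixel blocks
    else:
        return 4    # shade in 4x4 pixel blocks in the periphery
```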
Using eye tracking for camera
control can be problematic. Suppose someone is playing an FPS where aiming is
linked to camera control, and you want to use eye tracking for it. What
happens when the player looks at a target that's off-center? You could
decouple aiming from camera control, with a mouse controlling the camera and
eye tracking controlling aim, but this requires modifying the game software.
Or you could have a button that snaps the aim to the current eye position, but
there are two issues with that which I already brought up, one of which you
might not have noticed. One is the inherent inaccuracy of eye tracking as an
interface. Can you guess what the other one was?
"Your eyes moving to
look at things happens slightly before you become consciously aware of
looking at them." By the time you press a button to move to what you "are"
looking at, you might already be looking at something else. It's certainly
possible to add a slight delay, but the correct delay length can vary
somewhat. This issue obviously isn't limited to FPS games: it's present any
time you want the user to press a button to do an action with the item
they're looking at. It can be mostly solved, but high framerate and some
intelligent processing are necessary.
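One plausible form of that "intelligent processing" (a sketch using my own
guessed latencies, not a description of any shipping tracker): keep a short
history of gaze samples, and when the button press arrives, resolve the target
against where the eyes were a fraction of a second earlier.

```python
import time
from collections import deque

class GazeHistory:
    """Buffer recent gaze samples so a button press can be resolved against
    where the user was looking when they decided to press, not where their
    eyes happen to be by the time the press arrives."""

    def __init__(self, max_age_s=1.0):
        self.samples = deque()          # (timestamp, x, y)
        self.max_age_s = max_age_s

    def add(self, x, y, t=None):
        t = time.monotonic() if t is None else t
        self.samples.append((t, x, y))
        while self.samples and t - self.samples[0][0] > self.max_age_s:
            self.samples.popleft()

    def target_at_press(self, press_time, look_back_s=0.15, window_s=0.05):
        """Average samples in a small window around press_time - look_back_s.
        The 150 ms look-back (roughly a reaction time) is a guess to tune."""
        center = press_time - look_back_s
        points = [(x, y) for (t, x, y) in self.samples if abs(t - center) <= window_s]
        if not points:
            return None
        return (sum(x for x, _ in points) / len(points),
                sum(y for _, y in points) / len(points))

# On each tracker frame: history.add(gaze_x, gaze_y)
# On the button press:   target = history.target_at_press(time.monotonic())
```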
The bad news is that developing a useful eye tracking interface seems to be a fairly large and expensive project, once you consider the software redesign it requires. But that's also good news for companies like Microsoft and Apple that are pursuing eye tracking: the difficulties involved are a "moat" that could let them recoup their investment.